home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Shareware Overload Trio 2
/
Shareware Overload Trio Volume 2 (Chestnut CD-ROM).ISO
/
dir35
/
cff51b.zip
/
CFF.DOC
next >
Wrap
Text File
|
1993-10-23
|
58KB
|
1,711 lines
Culvers Fabulous Filesystem
Version 5.1
Beta Release
October 23, 1993
Copyright 1991, 1992, 1993 Norman D. Culver
All Rights Reserved
Send bug reports and suggestions to:
70672.1257@COMPUSERVE.COM Internet
or
Oxbow Software
1323 S.E. 17th Street #662
Ft. Lauderdale, FL 33316
(305) 527-1663 Phone
(305) 760-7584 Fax
This software package is distributed as a preliminary version for
TESTING PURPOSES ONLY, WITHOUT ANY WARRANTY;
without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
GENERAL DESCRIPTION
This distribution contains a single threaded 32 bit portable library
compiled with gcc for the Intel 386 cpu. The library contains external
references to functions in 'cfport.c' which is supplied as source code.
The user can modify cfport.c to accommodate a non-unix like OS.
The user API for the library is declared in the file 'cff.h' and described
in this document. CFF is a filesystem/database engine which handles
allocation of memory, extended memory and disk. It supports incrementally
hashed and B+ tree storage maps in an identical fashion for memory and
disk. The standard flavors of malloc are included. NEW for the first time,
to my knowledge, is the ability to malloc by category and the ability
to malloc PERMANENTLY.
Data is stored in and accessed through 'objects' which are referenced
in the classical filesystem manner. e.g. MEMORY/myobject/subobject/...
Objects are opened and closed in the time honored unix fashion. But there
is a big difference between CFF objects and normal filesystem objects.
Each object has the properties of a directory, stack, file, index and
repository. Objects can be created as F_SORTED, in which case they are
B+ trees, and/or with their own local 'bitmaps' for locality of reference,
or, if hashed, with PREALLOCATED DATA AND ENTRIES for ultra fast inserts.
The objects expand and shrink automatically as entries are added or deleted.
The in-core index for hashed objects is never saved to disk because it can be
rebuilt when the object is loaded. Data can be accessed directly in the
localizer buffers or copied to/from user space.
COPY OBJECTS
The 'cfcopy' command will move an object and all subobjects within and
between filesystems. A filesystem on disk is denoted with the extension
'.cff'. The filesystems for memory and extended memory
(if it exists) are predefined as MEMORY and EXTDMEM. The copy command
will create a filesystem on disk if an object is copied to a '.cff' object.
The command:
cfcopy("newsys.cff", "MEMORY/object/subobject/target");
copies the object named 'target' to the disk file newsys.cff and thus
'target' becomes a filesystem. Conversely, filesystems can become objects, etc.
PATHNAME TRANSLATION
A pathname translation facility permits brevity.
Translation dictionaries are maintained for process, application and system.
Example:
void *h1, *h2;
cfdef("phone book", "myfile.cff/users/applications/faxmodem/phones");
h1 = cfopen("phone book", F_RDWR, NULL);
or
h2 = cfcopy("MEMORY/tempobj", "phone book");
or
h2 = cfcopy("MEMORY/tempobj", h1);
And when finished you can get a fully garbage collected and shrunk disk object
with the command:
cfcopy(h1, h2);
or
cfcopy("phone book", h2);
or
cfcopy("phone book", "MEMORY/tempobj");
EXTERNAL FILES
It is possible to access normal OS files and directories. If a path doesn't
reference a .cff filesystem then it is assumed to be external. Copy to/from an
external file only exercises the file property of an object, so it is
trivial to import/export external data.
FANCY MALLOCS
The CFF library supplies standard versions of malloc, calloc, realloc etc.
which MUST superceed the standard compiler versions. Don't worry, these
implementations are fast and debugged. This BETA version of the library
does not include debugging support for when you corrupt the heap, it will be
in the commercial version when it is released.
The enhancements to malloc are simple to use but rather complicated to
implement. They permit the programmer to malloc, calloc, free etc. by
CATEGORY and they are FAST. Category 0 is reserved for normal malloc, and
is somewhat segregated in the address space because it calls 'sbrk()'
directly. All other categories call cfmalloc() which does it's own calls
to sbrk (actually PORTSBRK in cfport.c). NOTE: Normal malloc, etc. will work
even if cfinit() is not issued. This is nice.
Examples:
void *ptr;
ptr = mallocC (5, 1024); // malloc 1K bytes for category 5
freeC (5, ptr); // free the pointer in category 5
freecat (5); // free all memory allocated to category 5
Permanent categories have negative numbers.
ptr = malloc (-20, 1024); // malloc 1K bytes for permanent category -20
freeC (-20, ptr); // free the pointer in category -20
freecat (-20); // free all memory allocated to category -20
Permanent categories are enabled only if the programmer has named a
filesystem when 'cfinit' is called.
main()
{
cfinit("app1", 512, "myfile.cff"); // enables permanent categories in myfile.cff
or
cfinit("app1", 512, NULL); // permanent categories are disabled;
...
cfexit(); // saves active permanent categories and closes objects
}
Data in active permanent categories is reloaded to exactly the same memory
addresses at program startup and saved in the named filesystem at program exit.
Creep in the data segment as program development proceeds is accommodated
by setting the variable 'heapstart' in cfport.c. The programmer should
pick a number large enough to accommodate the environment in which development
is taking place. If C++ is used, static constructors can gobble a lot of
space before cfinit() is called. The command cfcreep() returns the amount
of space in the safety zone. WARNING: you can't change heapstart after
permanent categories have been saved in a file.
The named filesystem can hold permanent categories for multiple applications
providing the applications are uniquely identified by the first argument
to cfinit().
In a clean design, a programmer would save a small number of pointers (one??)
to a complex memory structure which will come and go automatically as the
application is run. The purpose here is to speed up the process of loading
permanent objects. Some CAD programs take 30 minutes to start; with CFF
startup can be reduced to seconds on similarly endianed machines.
for example:
main()
{
struct masterstruct *masterptr;
cfinit("myapp", 256,"myfile.cff"); // PERMINFO is now defined
if(!cfisnew(PERMINFO))
{/* The file is not new, load the saved copy of masterptr */
cfget(PERMINFO, "masterptr", 9, &masterptr, sizeof(void *));
}
else
{/* The file is new, build the basic structures */
masterptr = mallocC(-1, sizeof(masterstruct)); // or whatever
... // build memory structures based on masterptr
// and various negative categories
}
... // use and modify the permanent structures
// always use negative categories
/* Save the master pointer before exit */
cfreput(PERMINFO, "masterptr", 9, &masterptr, sizeof(void *));
cfexit();
}
PREDEFINED PATHNAMES (use as the leading element of a path)
MEMORY the memory filesystem
EXTDMEM the extended memory filesystem
PREDEFINED PATHNAMES or HANDLES (use as the leading element or as a handle)
PERMFILE the filesystem which was mentioned in cfinit()
PERMCAT an object in PERMFILE containing the app's permanent categories
PERMINFO an object in PERMFILE into which the user can store app info
ALIGNMENT
The granularity of the BETA release is 16 bytes for malloc and 32 bytes
for filesystem allocations. I am considering allowing each bitmap to
carry a different alignment, if all programmers used gcc then I could
use long long arguments in the user API.
NODE SIZES
Node (bucket) sizes can range from 512 bytes to 8192 bytes. When an
object is created, the default size is 2048 bytes. The programmer can
modify this size by setting a flag in the mode bits for cfopen. The
root directory of a filesystem is defaulted to 1024 bytes per node.
F_FILEONLY 512 bytes
F_BIGDIR 4096 bytes
F_HUGEDIR 8192 bytes
KEYS, ITEMS and DATA
Keys are the names of things in the filesystem. When used as part of a
path the key must contain only characters in the range 0x40 to 0x7f
(space to tilde) or it will be rejected by the name translation mechanism.
In all other cases a key may contain anything. When using hashed objects
the programmer should make a distinction between long and short keys.
Short keys are 8 bytes or less and are stored directly in the hash buckets
along with Items. Long keys are stored in a special KEYCACHE area and
require the overhead of the hash bucket entry which is 20 bytes plus
the overhead of the KEYCACHE which is 13 bytes + the size of the key.
Note: the minimum KEYCACHE allocation is 32 bytes. When a long key is
stored in a hash bucket the 8 byte key region is converted to an alternative
hash value for the key, this ensures that only one key comparison
is ever done. B+ tree objects store the keys and items together in the
tree nodes. The programmer should ensure that at least 2 keys will fit
into a B+ tree node, 1 key per node results in a B tree.
The key comparison routine can be set for each object with the command
cfkeycmp(handle, funcaddr);
Items are the basic insertable/retrievable elements in a node. They are
64 bits in size with the high 4 bits reserved for tags. B+ trees can support
untagged items if created with F_UNTAGGED set.
The programmer will find that 64 bit items are a bit of a pain, especially
if his compiler does not support the long long type. They are worth the
extra effort, believe me.
When the system stores a tagged item it checks the high 4 bits, if they
are non-zero it assumes that a tag is already present, otherwise the
item is tagged as a STO_VALUE. Be SURE that your program generated items
have the high 4 bits set to zero.
Due to the tagged nature of most items, the system can automatically allocate
and deallocate storage if the item happens to describe a chunk of data.
Currently the maximum size of a single contiguous chunk is 16 Megabytes -1.
This can change if I implement adjustable alignment per bitmap.
See the file cff.h for tag values.
The item comparison routine can be set for each object with the command
cfitemcmp(handle, funcaddr);
A copy of the default keycmp and itemcmp functions are located in the
module cfport.c so that the programmer can see what needs to be done for
alternative functions.
HASHED vs SORTED
The default method for an object is hashed. The root directory of a
filesystem is forced to be hashed even if created as F_SORTED. Thus
a SORTED object cannot become a filesystem with the cfcopy command.
I may change this but it complicates opening a filesystem.
SORTED pros
1. Sequential access of a sorted object produces sorted output.
2. Sorted objects can support untagged items, no subobjects are allowed.
3. Normal mode duplicate entries have the overhead of only one key per node.
4. An unlimited number of normal mode duplicate entries is supported.
5. There is no in-core index for the nodes. This could be important for
an object with 100 million items.
6. Inserts and deletes to sorted objects do not invalidate the mark.
SORTED cons
1. Access and update of sorted objects is MUCH slower than hashed.
2. Key length is limited by the node size.
HASHED pros
1. Hashed objects are fast to update and access, and VERY fast with short keys.
2. Key length is unlimited (16 Megabytes).
3. Hashed objects can be created with PREALLOCATED entries and data. This
produces complete locality of reference and great speed when creating
a database.
HASHED cons
1. The in-core index for each object takes up space (4 bytes per bucket).
This is a consideration when the object contains a lot of items. The
maximum bucket of 8192 bytes contains 406 items.
2. Normal mode duplicate entries is limited to 406 dups.
3. Under pathological conditions the hashed object may fill up, i.e. a
bucket may contain a mix of keys which cannot be split.
4. Sequential access to a hashed object produces unsorted output.
DUPLICATE ENTRIES
Keys with duplicate entries are supported in two ways: the normal
'dupnum' way and the 'dupname' way. Dupnames are 48 bit unique names + a 12 bit
unique id. Each object can support and control 4093 dupname sets.
Dupnames can be used as keys or items. Normal dups must be accessed by
a key-item combination or sequentially; the items for each normal
dup should be different.
DO NOT MIX NORMAL AND DUPNAME DUPLICATES.
OBJECTS AS FILES
Any object is a file, just read and write to it. If you read
before writing, read will return ERROR. When an object is closed
space allocated to the file property is truncated to within 128 bytes
of the filesize, filesize is limited to 2G bytes.
OBJECTS AS STACKS
Any object is a stack, just push and pop items or data. Stacks retain
their data when an object is closed and/or copied, stack depth is unlimited.
OBJECTS AS DIRECTORIES
Any object is a directory, just create a subobject (subdirectory) with
the cfopen or cfsubopen commands. The directory property can be accessed
with cfopendir, cfreaddir, cfrewinddir, cftelldir, cfseekdir and cfclosedir.
OBJECTS AS INDEXES
Any object is an index, just insert, find and delete keyed items.
To find out what is in an object, issue the command cfprintentries(something).
OBJECTS AS REPOSITORIES
Any object is a repository, just put and get keyed data chunks
(max 16Meg per chunk, min 32 bytes per chunk).
INTERNAL BUFFERING
This BETA version supports 512 buffer headers and unlimited data
buffering. The cfinit command includes an argument for the number of 1K blocks
of data buffering allowed. cfinit("app",700,NULL) designates that 700K of
memory be allocated to buffering.
Data chunks can be accessed directly in the buffer region with the command
cflocalize(). The localizer will allow up to half of the defined
data region to be allocated to a single localization request, excess
buffers are flushed. The size of the data region may be increased/decreased
dynamically with the cfmodbufs() command.
This localizer does not localize blocks; it localizes chunks which may
be large or small. Most of the time the localizer is working with nodes
and keys so the memory reqirements are not large internally. The file
property is read/written in 8K chunks. If the user creates a huge chunk
then it is up to the user to deal with it.
Huge data chunks can be accessed with the command cfopen_chunk() followed
by reads and writes to the returned handle.
LAZY WRITING
The writethrough properties of an object can be set with the commands
cfsetlazy(handle) and cfsetverylazy(handle). The command cfclrlazy(handle)
causes the object and all underlying buffers, including the OS, to be flushed.
The default writethrough property of an object causes it to be up to date
at the user API level. By this I mean that the CFF buffers are written to
the OS buffers. In order to ensure that the OS buffers are flushed, the
programmer should issue cfflush(handle) and cfsync() commands when appropriate.
BITMAPS
The term 'bitmaps' is misleading. CFF uses extent maps which are 512 bytes
in size and contain 49 sorted entries per map. Each bitmap set has an
in-core sorted index. The root directory of each filesystem has a bitmap
set by default. Individual objects can be created with a bitmap which
will control space allocation for the object and it's subobjects. If an
object has a bitmap it can be deleted very quickly, the system merely returns
all the space defined by the maps to the parent maps. Without a bitmap the
deletion of a complex object can take a while. Nevertheless, DO NOT USE
BITMAPS unless you know what you are doing. This is because they lead to
fragmentation in any environment that involves active insertions and deletions.
Preferably you would pre-allocate all the anticipated space to the bitmap
when you create the object, but this usually means that you allocate a lot
more than is really needed. To find out more about bitmaps, issue the command
cfprintbitmaps(something). Hashed objects which have preallocated data and
entries do not return space to the controlling bitmaps when something
is deleted, this is good. To shrink a sparsely populated preallocated hashed
object, copy it.
CFF maintains two caches for each set of bitmaps, the KEYCACHE is used
for storage of hashed keys and tends to segregate long keys from the data,
this is good; the CACHE is used to dispense space for data, and nodes.
Bitmaps are stored in the space that they map; often as the first 512
bytes.
GARBAGE COLLECTION
The system actively returns space to the underlying bitmaps and
also coalesces the bitmaps in a timely manner. The system currently does not
compress an object or filesystem in place, i.e. rearrange the placement
of things so that the end of a bitmap contains a nice chunk of space that
can be returned to it's parent. cfcopy() is the rich man's substitute.
ERROR REPORTING
Version 5.1 has very uninformative error reporting.
ACCESS CONTROL and SECURITY
Version 5.1 has no access control or security.
FILE AND RECORD LOCKING
Version 5.1 has no locks.
DATA COMPRESSION
Version 5.1 has no compression algorithm.
INFORMATIONAL PRINTING
The programmer can set the system print function with the command
cfsetprintfunc(int (*funcptr)(int));
The print function prints one character at a time and returns 1 if OK
and -1 if a device error.
The default print function calls PORTPRINT in cfport.c which writes
one character to file descriptor 1. Unbuffered printing gets you all
there is to see even when the system aborts.
CFF contains a built in printf 'cfprintf' which is reentrant and
prints one character at a time.
cfprintbitmaps(void *something);
Prints the bitmaps of something (a path or handle) and it's parents.
cfprintentries(void *something);
Prints the contents of the storage maps of something (a path or handle).
cfpflags(char *label, void *handle)
Prints the flags for an open object and its parents,
adds a programmer supplied label string.
THE COMMAND SET
NOTATION:
void *something Denotes a path or handle (use either one).
If a path, the object need not be open.
void *handle Denotes a handle for an open object or data chunk.
INITIALIZATION AND EXIT
void cfinit(
char *appname, // the name of the application
int bufferspace, // size of buffer area in KB
char *permfile // path of file containing permanent objs
)
Initialize, load permanent malloc categories from permfile,
load application and system string definitions from permfile,
define PERMFILE, PERMCAT and PERMINFO to refer to the appropriate
objects in permfile.
void cfexit(
void
)
Save permanent malloc categories, save current application and system string
definitions, close all open objects.
OPEN CLOSE READ WRITE etc.
typedef struct opninfo {
long initial_entries; // if hashed object, preallocates buckets
unsigned long bitmap_prealloc; // if object has bitmap, initial space
long data_prealloc; // if hashed object, data space per initial entry
} OPNINFO;
/* OPEN MODE BITS */
#define F_RDONLY 0x0001 // The object is readonly, the parents are rdwr
#define F_WRONLY 0x0002 // I don't think this works
#define F_RDWR 0x0003
#define F_CREAT 0x0004 // create if non-existant
#define F_TEMP 0x0008 // delete on close
#define F_UNIQ 0x0010 // create a unique name and append to path
#define F_EXCL 0x0020 // non shared open
#define F_BITMAP 0x0040 // attach a bitmap
#define F_TRUNC 0x0080 // truncate on open
#define F_APPEND 0x0100 // only append when writing to file
#define F_FILEONLY 0x0400 // 512 byte nodes
#define F_BIGDIR 0x0800 // 4096 byte nodes
#define F_HUGEDIR 0x1000 // 8192 byte nodes
#define F_SORTED 0x8000 // B+ tree
#define F_UNTAGGED 0x10000 // items are untagged if B+ tree
#define F_STAT 0x20000 // The object and parents are readonly
void *cfopen(
char *path, // pathname of object
long mode, // open mode bits
void *info // pointer to OPNINFO struct or NULL
)
Returns an opaque handle or NULL, check errno.
void *cfsubopen(
void *handle, // handle of open object
void *name, // name of subobject
long mode, // open mode bits
void *info // pointer to OPNINFO struct or NULL
)
Pastes the path, this is a convenience for the programmer.
Returns an opaque handle or NULL, check errno.
void *cfopen_chunk(
void *handle, // handle of open object
void *item // pointer to an Item describing a chunk
)
Permits read/write access to a fixed length chunk of data.
Use cfclose() when finished.
Returns an opaque handle or NULL, check errno.
void cfclose(
void *handle
)
Closes and flushes whatever is referenced by the handle.
If an object or file has been created with F_TEMP, it is deleted on close.
void cfflush(
void *handle
)
Ensures that all information pertaining
to the object or external file is written out.
void cfsync()
Ensures that all information pertaining
to all open objects and external files is written out.
long cfread(
void *handle,
void *userbuffer,
long amount
)
Reads from a file or a chunk.
Returns amount read or ERROR, check errno.
long cfwrite(
void *handle,
void *userbuffer,
long amount
)
Writes to a file or a chunk.
Returns amount written or ERROR, check errno.
long cfseek(
void *handle,
unsigned long amount,
int mode
)
Seeks within a file or a chunk.
It is legal to seek past the end of a file.
CFF Version 5.1 files do not have holes.
Returns position or ERROR
/* Seek modes */
#define S_SET 100
#define S_CUR 200
#define S_END 300
int cftruncate(
void *something,
unsigned long size
)
Truncates the file property to 'size', returns OK or ERROR
SEQUENTIAL DIRECTORY ACCESS -- JUST LIKE POSIX almost
typedef struct cfdirent {
int d_namlen;
char *d_name;
unsigned long d_bytesalloc;
unsigned long d_bytesused;
unsigned long d_mode;
unsigned long d_entrycnt;
void *d_fpt;
} CFDIRENT;
void *cfopendir(
void *something, // path or handle
long *mode // if non NULL, receives mode bits
)
Opens the directory aspect of a path or handle.
Works for external files.
Returns opaque handle, or NULL, check errno.
void cfclosedir(
void *openhandle
)
Close using the handle produced by cfopendir.
CFDIRENT *cfreaddir(
void *openhandle
)
Returns a pointer to a CFDIRENT struct, or NULL if EOD.
void cfrewinddir(
void *openhandle
)
Reset to beginning of directory.
void cftelldir(
void *openhandle,
STOR *curentry
)
Returns a pointer to the current directory entry.
void cfseekdir(
void *openhandle,
STOR *curentry
)
Sets the directory search to the spot returned by cftelldir.
SEQUENTIAL INDEX ACCESS
long cfhead(
void *handle,
Item *itemptr
)
Goto beginning of entries, get current item.
Returns OK or ERROR,
long cfhead_dupnum(
void *handle
void *keyptr,
int keylen,
void *itemptr
)
Goto beginning of normal duplicate entries for the key, get current item.
Returns OK or ERROR
long cfhead_dupname(
void *handle
void *keyptr,
int keylen,
void *itemptr
)
Goto beginning of DupName duplicate entries for the key, get current item.
Returns OK or ERROR
long cftail(
void *handle,
Item *itemptr
)
Goto end of entries, get current item.
Returns OK or ERROR,
long cftail_dupnum(
void *handle
void *keyptr,
int keylen,
void *itemptr
)
Goto end of normal duplicate entries for the key, get current item.
Returns OK or ERROR
long cftail_dupname(
void *handle
void *keyptr,
int keylen,
void *itemptr
)
Goto end of DupName duplicate entries for the key, get current item.
Returns OK or ERROR
long cfnext(
void *handle,
Item *itemptr
)
Goto next sequential entry, get current item.
Returns OK, EOI or ERROR,
long cfnext_dupnum(
void *handle
void *keyptr,
int keylen,
void *itemptr
)
Goto next normal duplicate entry for the key, get current item.
Returns OK, EOI or ERROR
long cfnext_dupname(
void *handle
void *keyptr,
int keylen,
void *itemptr
)
Goto next DupName duplicate entry for the key, get current item.
Returns OK, EOI or ERROR
long cfprev(
void *handle,
Item *itemptr
)
Goto previous sequential entry, get current item.
Returns OK, BOI or ERROR,
long cfprev_dupnum(
void *handle
void *keyptr,
int keylen,
void *itemptr
)
Goto previous normal duplicate entry for the key, get current item.
Returns OK, BOI or ERROR
long cfprev_dupname(
void *handle
void *keyptr,
int keylen,
void *itemptr
)
Goto previous DupName duplicate entry for the key, get current item.
Returns OK, BOI or ERROR
long cfkey(
void *handle,
void *keybufptr,
int keybuflen
)
Returns OK, BOI, EOI or ERROR.
If OK, fills the buffer denoted by 'keybufptr' with no more than
'keybuflen' bytes of the current key.
long cfitem(
void *handle,
Item *itemptr
)
Get item from current position.
Returns OK, BOI, EOI or ERROR.
long cfdata(
void *handle,
void *databufptr,
int databuflen
)
Returns OK, BOI, EOI or ERROR if invalid handle or no data available.
If OK, fills the buffer denoted by 'databufptr' with no more than
'databuflen' bytes of the current key's data area, if it exists.
long cfkeylen(
void *handle,
int *len
)
returns OK if a current key exists, len is set to the length of the key
NOTE: Short hashed keys always return a length of 8, even if they
were originally shorter.
long cfdatalen(
void *handle,
int *len
)
returns OK if a current key exists, len is set to data size or 0 if no data
long cfmark(
void *handle
)
Saves the current position. (see cffind_mark()).
Returns OK or ERROR.
INSERT ITEMS
int cfinsert(
void *handle,
void *keyptr,
int keylen,
void *itemptr
)
Returns OK or ERROR, duplicate entries are not allowed
int cfreinsert(
void *handle,
void *keyptr,
int keylen,
void *itemptr
)
Returns OK or ERROR, the key must exist, duplicate entries are not allowed.
If the existing item references data, the data space is deleted.
int cfinsert_dupnum(
void *handle,
void *keyptr,
int keylen,
void *itemptr,
long *dupcnt // returns the current dupcnt, 1 based
)
Returns OK or ERROR, 'normal' duplicates are allowed.
int cfreinsert_dupnum(
void *handle,
void *keyptr,
int keylen,
void *itemptr,
long *dupnum // points to the desired dupnum, 0 based
)
Returns OK or ERROR, the key and dupnumn'th duplicate must exist.
If the existing item references data, the data space is deleted.
NOTE: Items entered as normal duplicates do not remain in the order
in which they are entered, B+ trees sort the keys and items,
hashed directories get rearranged when split. Therefore, reinserting
to a specific dupnum is a chancy business and should be done only
when the programmer is certain of the algorithm.
int cfinsert_dupname(
void *handle,
void *keyptr,
int keylen,
void *itemptr,
DupName *dupname // returns current dupname
)
Returns OK or ERROR, the value referenced by 'dupname' is filled with the
current DupName.
NOTE: DupNames are constantly incremented and provide a completely unique
way to identify a key-item pair. Each DupName contains a 48 bit counter
that is decremented only under special circumstances (see cfdelete_lastdupname).
If a program inserted 1000 dupnames per second, it would take more than 8000
years to overflow the counter.
int cfreinsert_dupname(
void *handle,
void *keyptr,
int keylen,
void *itemptr,
DupName *dupname // points to the desired DupName
)
If the existing item references data, the data space is deleted.
Returns OK or ERROR, the key and dupname'th duplicate must exist.
INSERT DATA
NOTE: 1. Data may not be inserted in objects with untagged items.
2. reput acts like realloc, if the entry doesn't exist it is created.
If dupnums or DupNames are used with reput, then the desired entry
must exist or the operation fails.
void *cfput(
void *handle,
void *keyptr,
int keylen,
void *databuffer,
long databuflen,
void *itemptr // if non NULL then the Item is filled in
)
Returns 'itemptr' if successful, NULL if not.
The Item referenced by 'itemptr' describes a data chunk.
If itemptr is NULL, then the non-NULL value returned on success is not
a valid itemptr in its own right.
void *cfreput(
void *handle,
void *keyptr,
int keylen,
void *databuffer,
long databuflen,
void *itemptr // if non NULL then the Item is filled in
)
Overwrites old data, if no old data, a new entry is created.
Returns 'itemptr' if successful, NULL if not.
The Item referenced by 'itemptr' describes a data chunk.
If itemptr is NULL, then the non-NULL value returned on success is not
a valid itemptr in its own right.
void *cfput_dupnum(
void *handle,
void *keyptr,
int keylen,
void *databuffer,
long databuflen,
void *itemptr,
long *dupcnt // returns the current dupcnt, 1 based
)
Returns 'itemptr' if successful, NULL if not.
The Item referenced by 'itemptr' describes a data chunk.
If itemptr is NULL, then the non-NULL value returned on success is not
a valid itemptr in its own right.
void *cfreput_dupnum(
void *handle,
void *keyptr,
int keylen,
void *databuffer,
long databuflen,
void *itemptr, // if non NULL then the Item is filled in
long *dupnum // references the desired dupnum, 0 based
)
Overwrites old data, if no old data, the operation fails.
Returns 'itemptr' if successful, NULL if not.
The Item referenced by 'itemptr' describes a data chunk.
If itemptr is NULL, then the non-NULL value returned on success is not
a valid itemptr in its own right.
void *cfput_dupname(
void *handle,
void *keyptr,
int keylen,
void *databuffer,
long databuflen,
void *itemptr, // if non NULL then the Item is filled in
DupName *dupname // returns the current DupName
)
Returns 'itemptr' if successful, NULL if not.
The Item referenced by 'itemptr' describes a data chunk.
If itemptr is NULL, then the non-NULL value returned on success is not
a valid itemptr in its own right.
void *cfreput_dupname(
void *handle,
void *keyptr,
int keylen,
void *databuffer,
long databuflen,
void *itemptr, // if non NULL the Item is filled in
DupName *dupname // references the desired DupName
)
Overwrites old data, if no old data, the operation fails.
Returns 'itemptr' if successful, NULL if not.
The Item referenced by 'itemptr' describes a data chunk.
If itemptr is NULL, then the non-NULL value returned on success is not
a valid itemptr in its own right.
RETRIEVE DATA
long cfget(
void *handle,
void *keyptr,
void keylen,
void *databufptr,
long databuflen
)
Read the data for the first item of the key into the buffer for no
more than 'databuflen' bytes;
Returns FOUND, FOUND+1 if dups present, NOTFOUND, or ERROR
if >= FOUND sets up for sequential access.
long cfget_dupnum(
void *handle,
void *keyptr,
int keylen,
void *databufptr,
long databuflen,
long *dupnum // references the desired dupnum, 0 based
)
Read the data for the dupnum'th item of the key into the buffer for no
more than 'databuflen' bytes;
Returns FOUND, FOUND+1, NOTFOUND or ERROR.
If >= FOUND, sets up for sequential access.
long cfget_dupname(
void *handle,
DupName *dupname, // points to the desired DupName
void *databufptr,
long databuflen
)
Read the data for the dupname into the buffer for no
more than 'databuflen' bytes;
Returns FOUND, NOTFOUND or ERROR.
If FOUND, sets up for sequential access.
FIND ITEMS
int cffind(
void *handle,
void *keyptr,
int keylen,
void *itemptr // if FOUND the item is returned
)
Locates the first item of a key, and sets up for sequential access.
Returns FOUND, FOUND+1 if dups present, NOTFOUND, or ERROR
int cffind_dupnum(
void *handle,
void *keyptr,
int keylen,
void *itemptr, // if FOUND the item is returned
long *dupnum // points to the desired dupnum, 0 based
)
Returns FOUND, FOUND+1, NOTFOUND or ERROR.
If >= FOUND, sets up for sequential access.
int cffind_dupname(
void *handle,
DupName *dupname // points to the desired dupname
void *itemptr, // if FOUND the item is returned
)
Returns FOUND, NOTFOUND or ERROR. If FOUND, sets up for sequential access.
int cffind_item(
void *handle,
void *keyptr,
int keylen,
void *itemptr
)
Locates a key-item pair.
Returns FOUND, NOTFOUND or ERROR. If FOUND, sets up for sequential access.
int cffind_mark(
void *handle,
void *itemptr
)
Returns the item for the current mark if the mark is valid (see cfmark()).
The mark can be invalidated if items are inserted or deleted after the
mark is set. This phenomenon happens more often with hashed directories
than with B+ trees.
Returns FOUND, NOTFOUND or ERROR. If FOUND, sets up for sequential access.
DELETE ITEMS
int cfdelete(
void *handle,
void *keyptr,
int keylen
)
The first item of the key is deleted along with any data.
Returns OK or ERRORNUM a negative number.
int cfdelete_item(
void *handle,
void *keyptr,
int keylen,
void *itemptr
)
If the key-item pair exists it is deleted along with any data.
Returns OK or ERRORNUM a negative number.
int cfdelete_dupnum(
void *handle,
void *keyptr,
int keylen,
long dupnum // 0 based
)
If the dupnum'th item for the key exists it is deleted along with any data.
Returns OK or ERRORNUM a negative number
int cfdelete_dupname(
void *handle,
void *keyptr,
int keylen,
DupName *dupname
)
If the dupname exists, it is deleted along with any data.
If the dupname is deleted, the 'holes' counter is incremented by one.
Each DupName set contains a 48 bit holes counter. (see cfcountdups()).
returns OK or ERRORNUM a negative number.
int cfdelete_lastdupname(
void *handle,
void *keyptr,
int keylen
)
If the last DupName for the key exists, it is deleted along with any data.
The DupName counter is decremented by one.
Returns OK or ERRORNUM a negative number.
int cfdelete_lastdupnum(
void *handle,
void *keyptr,
int keylen
)
If the last dupnum for the key exists, it is deleted along with any data.
WARNING: The last dupnum is probably not the last dup entered. B+ trees
sort all input and hashed directories are reorganized when split/coalesced.
Returns OK or ERRORNUM a negative number.
int cfdelete_alldupnames(
void *handle,
void *keyptr,
int keylen
)
All 'DupName' entries for the key are deleted along with any data.
Returns OK or ERROR
int cfdelete_alldupnums(
void *handle,
void *keyptr,
int keylen
)
All 'dupnum' entries for the key are deleted along with any data.
Returns OK or ERROR
DELETE OBJECTS
int cfunlink(
void *something, // path or object handle
... // optional 2'nd arg if arg 1 is a handle
)
Deletes the object denoted by 'something' and an optional second argument.
If 'something' is a path, then the target of the path is deleted.
if 'something' is a handle, then if the second argument is NULL
the object denoted by the handle is deleted. A non-NULL second argument
must be a character string which names an entry in the object's index.
The second argument must not refer to a sub_object.
Returns OK or ERROR if multiply opened or sub_objects open or not found.
NOTE: When an object is unlinked all of it's sub-objects are unlinked
and all data is returned to parent bitmaps. External files can be unlinked.
STACK OPERATIONS
long cfpush_item(
void *handle, // an open object
void *itemptr
)
Push the item on the objects stack.
Returns current stack depth or ERROR.
long cfpush_data(
void *handle, // an open object
void *datbufptr,
int datbuflen
)
Pushes data onto the objects stack (max 16MB).
Returns current stack depth or ERROR.
long cfpop_item(
void *handle, // an open object
void *itemptr
)
Pops an item from the top of the objects stack.
IF THE ITEM DESCRIBES DATA, THE DATA HAS NOT BEEN DELETED (use cfretspace()).
I have chosen this dangerous method of dealing with the stack because
it can be very useful. i.e. One part of a program can push a lot of data
without regard for where it is going or assigning a key to it, another
part of a program can pop the items which describe the data and save the
items for later keyed retrieval, the data moves once. This method is doubly
dangerous: First, the programmer must ensure that the data can eventually
be deleted and Second, if a hashed object contains PREALLOCATED ENTRIES
then the data should NEVER be deleted because the preallocation mechanism
will want to reuse the space, and indeed will overwrite the data at some
time in the future.
Returns current stack depth or ERROR.
long cfpop_data(
void *handle, // an open object
void *datbufptr,
int datbuflen
)
Pops data (if it exists) from the top of the stack.
If the top of the stack contains an item that does not describe data
the item is lost.
Returns current stack depth or ERROR.
long cfstackdepth(
void *handle // an open object
)
Returns current stack depth or ERROR.
SPACE ALLOCATION
void *cfgetspace(
void *handle, // an open object
long amount, // max 16MB
void *itemptr // receives an Item which describes the space
)
Returns 'itemptr' if success or NULL.
int cfretspace(
void *handle, // an open object
void *itemptr // pointer to an Item which describes space
)
Returns OK or ERROR
CFF Version 5.1 will abort if the Item does not describe valid space.
I think that this is better than returning an error.
MEMORY FUNCTIONS
void *malloc(size_t)
void *calloc(unsigned, unsigned)
void *realloc(void *, unsigned)
void *valloc(unsigned)
void *memalign(unsigned, unsigned)
void free(void *)
unsigned mallocsize(void *)
void *cfmalloc(unsigned, Item *)
void cffree(Item *)
MEMORY BY CATEGORY FUNCTIONS
void *mallocC(long, unsigned)
void *callocC(long, unsigned, unsigned)
void *reallocC(long, unsigned, unsigned)
void *vallocC(long, unsigned)
void *memalignC(long, unsigned, unsigned)
void freeC(long, void *)
void freecat(long)
CONTROL FUNCTIONS
long cfmodbufs(
long increment // positive or negative increment in K bytes
)
Changes the allowed buffering space by 'increment'.
Returns OK
long cfsetlazy(
void *handle // an open object
)
Changes the writethrough characteristic of the object to LAZY.
Some bookeeping entries are forced out.
Returns OK or ERROR
long cfsetverylazy(
void *handle // an open object
)
Changes the writethrough characteristic of the object to VERYLAZY.
No data is forced out.
Returns OK or ERROR
long cfclrlazy(
void *handle // an open object
)
Sets the writethrough characteristic of the object to NORMAL.
Forces out everything.
Returns OK or ERROR
long cfsetkeycmp(
void *handle, // open object
int (*func)(void *, int, void *, int) // function pointer
)
Changes the default key comparison routine for an object to a programmer
supplied version.
Returns OK or ERROR
long cfsetitemcmp(
void *handle, // open object
int (*func)(void *, void *) // function pointer
)
Changes the default item comparison routine for an object to a programmer
supplied version.
Returns OK or ERROR
long cfsetprintfunc(
int (*func)(int) // function pointer
)
Changes the default system print function to a programmer supplied version.
The function must print one character at a time,
and return 1 for success, -1 for error.
Returns OK
void cfport_settestflags(
int flags
)
Set flags in the portability module 'cfport.c'. This command is issued
before cfinit(). The only defined flag is 1, which causes the extended
memory driver to enable a 4Meg local buffer for testing purposes.
If the 1 flag is not set, the system will map extended memory to
primary memory unless the programmer has included a 'real' extended
memory driver in cfport.c
COPY OBJECTS, FILESYSTEMS and FILES
void *cfcopy(
void *something_dst, // designates the destination
void *something_src // designates the source
)
Copy the source to the destination.
The arguments can be paths or handles.
The source may be an object, a filesystem or an external file.
Ditto for the destination.
Sorted objects (created with F_SORTED) may not be copied to a filesystem.
If the source or destination is an external file, then the file property
of the object is copied.
Returns a handle to the OPENED destination or NULL.
NOTE: If the destination argument was an open handle, then the returned
handle SUPERCEEDS the original. i.e. the open object was deleted and
recreated.
DIRECT ACCESS TO DATA IN THE BUFFERS
void *cflocalize(
void *handle, // an open object
void *item // pointer to an Item which describes space
)
Returns a pointer to memory or NULL.
This command ties up one buffer header.
Be certain to release the buffer when finished.
The buffers are in the heap and thus are not memory protected.
void cfrelease(
void *memptr, // a pointer to memory returned by cflocalize()
long relmode // R_CLEAN, R_DIRTY, R_FLUSH
)
Release a buffer.
If the release mode is R_CLEAN, then the buffer may never be written out.
If the release mode is R_DIRTY, then the buffer will eventually be written out.
If the release mode is R_FLUSH, then the buffer is written out immediately if
the object writethrough condition is not VERYLAZY. If the object may be in
VERYLAZY mode, be sure to use (R_DIRTY|R_FLUSH).
NAME TRANSLATION FUNCTIONS
int cfdef(
char *keyptr, // the key for the definition string
char *defptr // the definition string, entered in local dictionary
)
Enters the definition string in the local dictionary, under the key.
Local definitions override application and system definitions.
Returns OK or ERROR
int cfundef(
char *keyptr // the key to a definition string
)
Deletes a definition string from the local dictionary.
returns OK or ERROR
int cfsysdef(
char *keyptr, // the key for the definition string
char *defptr // the definition string, entered in PERMFILE
)
Enters the definition string in the permanent system dictionary (if it exists).
PERMFILE is enabled if arg 3 of cfinit() mentions a valid '.cff' file.
Returns OK or ERROR
int cfsysundef(
char *keyptr // the key to a defined string
)
Deletes a definition string from the permanent system dictionary.
Returns OK or ERROR
int cfappdef(
char *keyptr, // the key for the definition string
char *defptr // the definition string, entered in PERMINFO
)
Enters the definition string in the permanent application dictionary.
Application definitions override system definitions.
PERMINFO is enabled if arg 3 of cfinit() mentions a valid '.cff' file.
Returns OK or ERROR
int cfappundef(
char *keyptr // the key to a defined string
)
Deletes a definition string from the permanent application dictionary.
Returns OK or ERROR
int cftrn(
char *input_string,
char **output_string
)
Translates the input string to the output string using the dictionaries.
The programmer must free the output string.
Returns OK or ERROR
int cfpathtrn(
char *input_string,
char **output_string
)
Translates the input string to a fully qualified path, using the dictionaries
and the current working directory.
The programmer must free the output string.
Returns 0 if internal object, 1 if external file, 2 if filesys, 3 if rawdevice
Returns ERROR if trouble.
int cfchdir(
char *newpath
)
Change the current working directory.
Returns OK or ERROR
NOTE: to get the current working directory:
{
char *cwd;
cfpathtrn(".", &cwd);
...
free(cwd);
}
NOTE: CFF does not include disk drive prefixes in the cwd, the programmer
may include them when opening a file or filesystem.
NOTE: The translator works mostly on the left hand side of a path;
it first tries to translate the whole input string, then it expands
the left hand side up to 10 times, then it tries to translate the
whole result string. Would a macro facility be helpful?
INFORMATIONAL FUNCTIONS
int cflastdupname(
void *handle,
void *keyptr,
int keylen,
DupName *dupname // filled if successful
)
If DupNames exist for the key, 'dupname' is filled with the last one.
Returns OK or ERROR
long cfcountdups(
void *handle,
void *keyptr,
int keylen
)
Returns the actual duplicate count for a key. If DupNames are being used
then the actual count is the last DupName minus the number of prior deletions.
Each DupName key has a deletion counter which is incremented for every
successful delete except cfdelete_lastdupname().
Normal duplicates are just overtly scanned and counted.
int cfstat(
void *something, // a handle or a path
void *stbuf // pointer to a CFSTAT struct
)
returns OK or ERROR
typedef struct cffstat {
unsigned long st_smhead;
unsigned long st_smtail;
unsigned short st_id;
unsigned short st_keysize;
STOR st_dups;
unsigned long st_bmhead;
unsigned long st_bmtail;
unsigned long st_mode;
short st_uid;
short st_gid;
long st_mtime;
long st_ctime;
unsigned long st_highleaf;
unsigned long st_size;
unsigned long st_alloc;
unsigned long st_entrycnt;
short st_mapsize;
unsigned short st_dupids;
long st_atime;
long st_filesize;
long st_filealloc;
long st_obtype;
unsigned int st_filedups;
long st_ino;
short st_blksize;
short st_dev;
short st_nlink;
short st_rdev;
} CFSTAT;
/* MODE BITS in st_mode */
#define M_ROOTDIR 0x80000000
#define M_FILEONLY 0x40000000
#define M_HASHDIR 0x20000000
#define M_TREEDIR 0x10000000
#define M_UNTAGGED 0x08000000
#define M_BITMAP 0x04000000
#define M_EXTRNFILE 0x02000000
#define M_PREALLOC 0x01000000
#define M_IFMT 0x000F0000
#define M_IFDIR 0x00004000
#define M_IFIFO 0x00002000
#define M_IFCHR 0x00001000
#define M_IFBLK 0x00003000
#define M_IFREG 0x00008000
#define M_IREAD 0x00000100
#define M_IWRITE 0x00000080
#define M_IEXEC 0x00000040
long cfentrycnt(
void *something
)
returns the entrycount of a handle or a path, or ERROR
long cfdepth(
void *handle
)
returns treedepth or ERROR
long cfbytesused(
void *handle
)
returns the bytes used in the object or ERROR
long cfbytesalloc(
void *handle
)
returns the bytes allocated to the object or ERROR
long cftotalloc(
void *something,
unsigned long *used,
unsigned long *alloc
)
returns OK if something exists, sets used and alloc to the total space
allocated to the object and all of it's subobjects in 1000's of bytes.
long cfstackdepth(
void *handle
)
returns the current stackdepth of the object or ERROR
long cfcurbufs(
void
)
returns the current allowed localizer buffer space in K bytes.
long cfisnew(
void *handle
)
returns 1 if object is newly created, 0 if not, ERROR if invalid handle
long cffilesize(
void *handle
)
returns the size of the file property of the object or ERROR
long cffilealloc(
void *handle
)
returns the space allocated to the file property or ERROR
long cfprealloc(
void *handle
)
returns the size of each preallocated chunk or 0 or ERROR
long cfmapsize(
void *handle
)
returns the node size of the object or ERROR
long cfalignment(
void *handle
)
returns the alignment for the object (32 is hardwired in version 5.1)
long cfissorted(
void *handle
)
returns 1 if the object is sorted, 0 if not or ERROR
void cfprintbitmaps(
void *something // a path or handle
)
Prints the bitmaps for the target object and it's parents.
void cfprintentries(
void *something // a path or handle
)
Prints all the entries in the target object's nodes.
long cfhash(
void *keyptr,
int keylen,
CAT *cat // pointer to a CAT structure
)
Calls the system hash function for the supplied key. The CAT structure
receives the hashed output. Useful for generating random numbers etc.
Returns OK.
long cfobtype(
void *something
)
returns the OB bits or ERROR
#define OB_SHARE 0x00000001 // object can be opened more than once
#define OB_ISDIR 0x00000002 // object is a directory
#define OB_BMOK 0x00000004 // bitmaps loaded
#define OB_SMOK 0x00000008 // storage maps loaded
#define OB_MEM 0x00000010 // memory object
#define OB_RAWDEV 0x00000020 // raw device
#define OB_CFILE 0x00000040 // part of a disk filesys
#define OB_SETUP 0x00000080 // object is setup
#define OB_FOD 0x00000100 // on a file oriented device
#define OB_ROOTDIR 0x00000200 // object is the root directory
#define OB_DIRTY 0x00000400 // has been written to
#define OB_DELCLOSE 0x00000800 // delete on close
#define OB_WRITE 0x00001000 // ok to write to this object
#define OB_BITMAP 0x00002000 // has a bitmap
#define OB_XFILE 0x00004000 // is an external OS file
#define OB_ISNEW 0x00008000 // newly created
#define OB_SMEM 0x00010000 // extended memory object
#define OB_FILEONLY 0x40000000 // created with F_FILEONLY
#define OB_HASHDIR 0x20000000 // uses hashed maps
#define OB_TREEDIR 0x10000000 // uses B+ tree maps
#define OB_UNTAGGED 0x08000000 // contains untagged items
#define OB_PREALLOC 0x01000000 // contains prealloced data and entries